Automatic speech recognition using dynamic Bayesian networks with both acoustic and articulatory variables
Abstract
Current technology for automatic speech recognition (ASR) uses hidden Markov models (HMMs) that recognize speech from the acoustic signal. However, no use is made of the causes of the acoustic signal: the articulators. We present here a dynamic Bayesian network (DBN) model that uses an additional variable to represent the state of the articulators. A particular strength of the system is that, while it uses measured articulatory data during training, it does not need to know these values during recognition. As Bayesian networks are not often used in the speech community, we give an introduction to them. After describing how they can be used in ASR, we present a system that performs isolated word recognition using articulatory information. Recognition results are given, showing that a system with both acoustics and inferred articulatory positions performs better than a system with only acoustics.
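The idea that articulatory values are not needed at recognition time can be illustrated with a toy example. The sketch below is not the authors' system: it assumes a fully discrete model with a hidden phone-like state q, a hidden articulatory variable a, and a discrete acoustic symbol o, and every name and probability table in it is hypothetical. It runs the forward algorithm over the joint (q, a) state, so the articulatory variable is summed out rather than observed, and picks the word model with the highest likelihood.

```python
from itertools import product


def forward_likelihood(obs, init_q, trans_q, init_a, trans_a, emit):
    """Return P(obs) for a toy DBN in which both q and a are hidden.

    init_q[q]         = P(q_1 = q)
    trans_q[qp][q]    = P(q_t = q | q_{t-1} = qp)
    init_a[q][a]      = P(a_1 = a | q_1 = q)
    trans_a[q][ap][a] = P(a_t = a | q_t = q, a_{t-1} = ap)
    emit[q][a][o]     = P(o_t = o | q_t = q, a_t = a)
    """
    n_q, n_a = len(init_q), len(init_a[0])
    states = list(product(range(n_q), range(n_a)))

    # Initial forward messages over the joint hidden state (q, a).
    alpha = {
        (q, a): init_q[q] * init_a[q][a] * emit[q][a][obs[0]]
        for q, a in states
    }
    # Recursion: sum over every previous (q, a) pair, so the unobserved
    # articulatory variable is marginalized at each time step.
    for o in obs[1:]:
        alpha = {
            (q, a): emit[q][a][o] * sum(
                alpha[(qp, ap)] * trans_q[qp][q] * trans_a[q][ap][a]
                for qp, ap in states
            )
            for q, a in states
        }
    return sum(alpha.values())


def recognize(obs, models):
    """Isolated-word decision: the word whose model best explains obs."""
    return max(models, key=lambda w: forward_likelihood(obs, **models[w]))


def make_word_model(emit):
    """Hypothetical two-state, two-articulator word model (made-up numbers)."""
    return dict(
        init_q=[1.0, 0.0],
        trans_q=[[0.6, 0.4], [0.0, 1.0]],
        init_a=[[0.7, 0.3], [0.2, 0.8]],
        trans_a=[[[0.8, 0.2], [0.3, 0.7]], [[0.4, 0.6], [0.1, 0.9]]],
        emit=emit,
    )


# Two toy words that differ only in their emission tables.
MODELS = {
    "word_a": make_word_model([[[0.9, 0.1], [0.8, 0.2]], [[0.9, 0.1], [0.8, 0.2]]]),
    "word_b": make_word_model([[[0.2, 0.8], [0.1, 0.9]], [[0.2, 0.8], [0.1, 0.9]]]),
}

if __name__ == "__main__":
    print(recognize([0, 0, 0], MODELS))
```

In a setup like this, training could estimate the tables from data in which a is measured, while recognition only ever calls forward_likelihood on the acoustic symbols. The abstract does not specify the real system's variable types or topology, so those details are left as assumptions here.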
Related resources
Hidden feature models for speech recognition using dynamic Bayesian networks
In this paper, we investigate the use of dynamic Bayesian networks (DBNs) to explicitly represent models of hidden features, such as articulatory or other phonological features, for automatic speech recognition. In previous work using the idea of hidden features, the representation has typically been implicit, relying on a single hidden state to represent a combination of features. We present a...
Integration of articulatory dynamic parameters in HMM/BN based speech recognition system
In this paper, we describe several approaches to integrating articulatory dynamic parameters, along with articulatory position data, into an HMM/BN model based automatic speech recognition system. This work is a continuation of our previous study, where we successfully combined speech acoustic features in the form of MFCC with articulatory position observations. Articulatory dynamic parame...
Production Knowledge in the Recognition of Dysarthric Speech
Frank Rudzicz. Doctor of Philosophy, Graduate Department of Computer Science, University of Toronto, 2011. Millions of individuals have acquired or have been born with neuro-motor conditions that limit the control of their muscles, including those that manipulate the articulators of the vocal tract. These conditions, collecti...
Speech Recognition with Dynamic Bayesian Networks
Dynamic Bayesian networks (DBNs) are a useful tool for representing complex stochastic processes. Recent developments in inference and learning in DBNs allow their use in real-world applications. In this paper, we apply DBNs to the problem of speech recognition. The factored state representation enabled by DBNs allows us to explicitly represent long-term articulatory and acoustic context in add...
Integration of articulatory and spectrum features based on the hybrid HMM/BN modeling framework
Most of the current state-of-the-art speech recognition systems are based on speech signal parametrizations that crudely model the behavior of the human auditory system. However, little or no use is usually made of the knowledge on the human speech production system. A data-driven statistical approach to incorporate this knowledge into ASR would require a substantial amount of data, which are n...
Publication date: 2000